Who feeds the world ? And how wealthy are they ?

There are folium maps in this notebook. If they do not display well, please have a look at the .html file from the same repository.

Abstract

Our main goal in this project is to find social and economic world-wide relations of countries based on the state of their agricultural sector, using indicators such as imports, exports, production, self-sufficiency, etc. In order to find such indicators, we would like to use the data from the "Global Food & Agriculture Statistics" datasets. First, we would like to produce a map showing which countries "feed the world" i.e. which countries are net-exporting food. That map would contain a slider to show how this evolved over the past fifty years. Then we would like to show countrywise the level of food self-sufficiency i.e. the way a country does not need to trade with other ones in order to feed its population. We will also compare it to nations' economic development and see if some correlations can be found. We will visualize our best findings with several interactive maps and plots.

Research questions

We would like to work on the following research questions:

  • How does the production and consumption of food look like from a geographical point of view ? Which countries are net food exporters or importers? How did this evolve over the last few decades ?
  • What's the level of self-sufficiency in food production of individual countries and how does this change over time ?
  • Is there a link between the GDP per capita and the agricultural trade balance ? Are countries that are net exporters or importers richer ? Are self-sufficient countries richer ?
  • If we find any relations, are they also still valid if we check for specific crops ? Are there some crops that are mostly produced by richer countries, some that are mostly produced by poorer countries?

External imports:

In [1]:
import pandas as pd
import numpy as np
import os
import matplotlib.pyplot as plt
import folium
import seaborn as sns
import json
import re
import requests
from bs4 import BeautifulSoup
from ipywidgets import interact
from IPython.display import display

Auxiliary function imports:

We have implemented some functions into a dedicated module (file Milestone_2_scripts.py) in order to simplify the code and make this notebook more enjoyable to read.

In [2]:
#from Milestone_2_scripts import *

Setup:

In [3]:
data_folder_path = "./Data/current_FAO/raw_files/"

files = {"Crops production" : "Production_Crops_E_All_Data_(Normalized).csv",
         "Crops trade" : "Trade_Crops_Livestock_E_All_Data_(Normalized).csv", 
         "Consumer price indices" : "ConsumerPriceIndices_E_All_Data_(Normalized).csv",
         "Macroeconomy" : "Macro-Statistics_Key_Indicators_E_All_Data_(Normalized).csv",
         "Livestock production" : "Production_Livestock_E_All_Data_(Normalized).csv",
         "Live animals trade" : "Trade_LiveAnimals_E_All_Data_(Normalized).csv"
        }
interesting_datasets = files.keys()

1.A. Dataset description

Our main dataset would be a subset of the "Global Food & Agriculture Statistics" that is found in the proposed datasets list. In this dataset, we have seen that we could work with the production as well as import and export quantities per year and per country. We will add information about countries GDP to this database.

1.B. Loading the data set

In [4]:
def load_datasets(datasets) :
    df = {}
    for dataset in datasets :
        file_path = data_folder_path + files[dataset]
        df[dataset] = pd.read_csv(file_path, encoding = "ISO-8859-1")
    return df

We load each interresting dataset in the dictionary df :

In [5]:
df = load_datasets(interesting_datasets)

1.C. Understanding the data set

In this part, we will have a first look of the datasets in order to get a first sense of the data.

In [6]:
def display_df(df, datasets):
    for dataset in datasets :
        display(dataset, df[dataset].sample(5))

In order to see what does the datasets look like, we display a sample of 5 rows for each of them :

In [7]:
display_df(df, interesting_datasets)
'Crops production'
Area Code Area Item Code Item Element Code Element Year Code Year Unit Value Flag
1650327 216 Thailand 1814 Coarse Grain, Total 5312 Area harvested 1967 1967 ha 619560.0 A
1337236 171 Philippines 809 Manila fibre (abaca) 5419 Yield 1971 1971 hg/ha 6735.0 Fc
1175319 149 Nepal 401 Chillies and peppers, green 5510 Production 1975 1975 tonnes NaN M
307204 39 Chad 1717 Cereals,Total 5312 Area harvested 1962 1962 ha 1277450.0 A
1810409 235 Uzbekistan 260 Olives 5419 Yield 2013 2013 hg/ha 10000.0 Fc
'Crops trade'
Area Code Area Item Code Item Element Code Element Year Code Year Unit Value Flag
14264555 5802 Land Locked Developing Countries 600 Papayas 5610 Import Quantity 1999 1999 tonnes 526.0 A
10533980 213 Turkmenistan 1923 Meat Fresh+Ch+Frozen 5622 Import Value 2008 2008 1000 US$ 8565.0 A
11665676 269 EU(27)ex.int 622 Juice, fruit nes 5610 Import Quantity 2011 2011 tonnes 112731.0 A
203427 7 Angola 1934 Milk Condensed+Dry+Fresh 5922 Export Value 1974 1974 1000 US$ 42.0 A
2792742 107 Côte d'Ivoire 662 Cocoa, paste 5610 Import Quantity 2010 2010 tonnes 0.0 F
'Consumer price indices'
Area Code Area Item Code Item Months Code Months Year Code Year Unit Value Flag Note
43819 117 Republic of Korea 23012 Consumer Prices, General Indices (2010 = 100) 7005 May 2012 2012 NaN 106.247855 X 2010
3260 12 Bahamas 23012 Consumer Prices, General Indices (2010 = 100) 7005 May 2012 2012 NaN 105.798205 X 2010
32907 132 Maldives 23013 Consumer Prices, Food Indices (2010 = 100) 7004 April 2014 2014 NaN 103.914734 X 2012M6
57862 215 United Republic of Tanzania 23012 Consumer Prices, General Indices (2010 = 100) 7009 September 2014 2014 NaN 150.118900 X 2010
20857 84 Greece 23013 Consumer Prices, Food Indices (2010 = 100) 7011 November 2000 2000 NaN 73.789450 X 2009
'Macroeconomy'
Area Code Area Item Code Item Element Code Element Year Code Year Unit Value Flag
526554 251 Zambia 22015 Gross Fixed Capital Formation 6117 Share of GDP in US$, 2005 prices 1997 1997 % 1.154083e+01 Fc
248308 83 Kiribati 22015 Gross Fixed Capital Formation 6156 Annual growth Local Currency, 2005 prices 2001 2001 % -3.334630e+00 Fc
541567 5204 Central America 22008 Gross Domestic Product 6108 Value US$, 2005 prices 2012 2012 millions 1.159624e+06 A
239303 110 Japan 22075 Value Added (Total Manufacturing) 6103 Share of GDP in US$ 1983 1983 % 2.614807e+01 Fc
95230 39 Chad 22016 Value Added (Agriculture, Forestry and Fishing) 6108 Value US$, 2005 prices 1978 1978 millions 7.933683e+02 XAM
'Livestock production'
Area Code Area Item Code Item Element Code Element Year Code Year Unit Value Flag
156220 5706 European Union 1749 Sheep and Goats 5111 Stocks 1970 1970 Head 122557347.0 A
4000 9 Argentina 1068 Ducks 5112 Stocks 1964 1964 1000 Head 1550.0 F
80202 148 Nauru 1057 Chickens 5112 Stocks 1967 1967 1000 Head 2.0 F
20650 33 Canada 1034 Pigs 5111 Stocks 1992 1992 Head 10596300.0 NaN
64203 118 Kuwait 1126 Camels 5111 Stocks 1970 1970 Head 10000.0 F
'Live animals trade'
Area Code Area Item Code Item Element Code Element Year Code Year Unit Value Flag
605488 5401 Eastern Europe 946 Buffaloes 5608 Import Quantity 1968 1968 Head NaN A
340567 162 Norway 1079 Turkeys 5922 Export Value 1995 1995 1000 US$ NaN M
408765 186 Serbia and Montenegro 1884 Live Animals 5622 Import Value 1997 1997 1000 US$ 4653.0 A
162482 63 Estonia 1921 Bovine, Animals 5622 Import Value 2012 2012 1000 US$ 853.0 A
504208 237 Viet Nam 1034 Pigs 5908 Export Quantity 1966 1966 Head 4701.0 NaN

At first glance, our datasets seem very clean.

Each of our dataset contains a column "Year" and a column that is either named "Area" or "Country". This is a great news for us since we want to do a both geographical and time-related analysis.

The columns "Area" and "Country" both correspond to the country except that the "Area" may contains a group of country (e.g. "Eastern Europe").

1.D. Cleansing the data set

In this part, we will clean the datasets. The final goal is to produce one uniformized and normalized dataset on which we could work (see 1.F).

Such a cleaned dataset may look like this (in a very simplistic way):

Country | Year | GDP | Crops production | Livestock production

1.D.a. Removing unuseful data

In this section, we will create dataframes in df_useful which correspond to previous dataframes without the unuseful data.

In [8]:
df_useful = {}
1.D.a.i. Extracting GDP from the "Macroeconomy" dataset
In [9]:
def extract_GDP(df):
    def selection_GDP(df):
        return df['Item']=='Gross Domestic Product'
    def selection_US_dollars(df):
        return df['Element']=="Value US$"
    def drop_columns(df):
        dropped_colmuns = ["Item Code", "Item", "Element Code", "Element", "Flag", "Year Code", "Unit"]
        return df.drop(columns = dropped_colmuns)
    return drop_columns(df[selection_GDP(df)&selection_US_dollars(df)])
In [10]:
df_useful["GDP"] = extract_GDP(df["Macroeconomy"])
In [11]:
display(df_useful["GDP"].sample(5))
Area Code Area Year Value
550676 5305 Western Asia 1989 4.193632e+05
119027 107 Côte d'Ivoire 1989 1.071445e+04
51649 17 Bermuda 1982 1.101104e+03
556580 5404 Western Europe 1998 4.985618e+06
222435 102 Iran (Islamic Republic of) 2003 1.535448e+05
In [12]:
select_switzerland = df_useful["GDP"]['Area']=='Switzerland'
select_france = df_useful["GDP"]['Area']=='France'
select_austria = df_useful["GDP"]['Area']=='Austria'
select_canada = df_useful["GDP"]['Area']=='Canada'
ax = df_useful["GDP"][select_switzerland].plot(x ='Year', y='Value', kind = 'line')
ax = df_useful["GDP"][select_france].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = df_useful["GDP"][select_austria].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = df_useful["GDP"][select_canada].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax.legend(["Switzerland", 'France', 'Austria', "Canada"])
Out[12]:
<matplotlib.legend.Legend at 0x20228f919c8>
In [13]:
select_USSR = df_useful["GDP"]['Area']=='USSR'
select_russia = df_useful["GDP"]['Area']=='Russian Federation'
select_ukraine = df_useful["GDP"]['Area']=='Ukraine'
ax = df_useful["GDP"][select_USSR].plot(x ='Year', y='Value', kind = 'line')
ax = df_useful["GDP"][select_russia].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = df_useful["GDP"][select_ukraine].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax.legend(["USSR", 'Russia', 'Ukraine'])
Out[13]:
<matplotlib.legend.Legend at 0x2022906ebc8>
1.D.a.ii. Extracting crops harvested area, production, seed and yield from the "Crops production" dataset
In [14]:
def get_food_crops():
    #Return a list of crops categorized as food crops https://world-crops.com/food-crops/
    url="https://world-crops.com/food-crops/"
    r=requests.get(url,headers={"User-Agent": "XY"})
    soup=BeautifulSoup(r.text,'html.parser')
    elements_temp=soup.find_all('a',href=re.compile("^../"))
    elements=[el.text for el in elements_temp]
    
    #only 40 elements are displayed on each page->iterating on the total list
    for i in range(40,401,40):
        url_i=url+"?ss="+str(i)
        r=requests.get(url_i,headers={"User-Agent":"XY"})
        soup=BeautifulSoup(r.text,'html.parser')
        new_elements=soup.find_all('a',href=re.compile("^../"))
        elements+=[el.text for el in new_elements]
    return elements

def inclusive_search(string,elements):
    #returns true if the string can be found in elements. The search removes special characters from string in order to include more positive results
    string=string.lower()
    delimiters = ",", "(","&",")"," and "," "
    pattern = '|'.join(map(re.escape, delimiters))
    strings=list(filter(None,re.split(pattern,string)))
    found=False
    for s in strings:
        if s=="nes":
            continue
        for el in elements:
            found=(s in el.split())
            if found==False and s[-1]=="s":
                found=s[:-1] in el.split()
            if found==False and s[-2:]=="es":
                found=s[:-2] in el.split()
            if found==False and s[-3:]=="ies":
                found=s[:-3]+"y" in el.split()
            if found==True:
                return found
    return found


def get_food_crop_data(df):    
    #extracts the food crop data, returns 4 df: Area,Production,Seed and yield    
    df=df.copy()
    food_crops=list(map(lambda x: x.lower(),get_food_crops()))              
    crop_types_df=df[['Item','Value']].groupby('Item').sum()
    crop_types_df=crop_types_df[list(map(lambda x : inclusive_search(x,food_crops) , crop_types_df.index ))]   
    food_crop_df=df[df.Item.apply(lambda x: x in crop_types_df.index)]
    return (food_crop_df[food_crop_df.Element=='Area harvested'],
            food_crop_df[food_crop_df.Element=='Production'],
            food_crop_df[food_crop_df.Element=='Seed'],
            food_crop_df[food_crop_df.Element=='Yield'])
  
food_crop_area_df , food_crop_production_df , food_crop_seed_df , food_crop_yield_df = get_food_crop_data(df["Crops production"])
In [15]:
df_useful['Crops Area harvested'] = food_crop_area_df.drop(columns=['Item Code', "Element Code", "Element", "Year Code", "Flag"])
df_useful['Crops Production'] = food_crop_production_df.drop(columns=['Item Code', "Element Code", "Element", "Year Code", "Flag"])
df_useful['Crops Seed'] = food_crop_seed_df.drop(columns=['Item Code', "Element Code", "Element", "Year Code", "Flag"])
df_useful['Crops Yield'] =  food_crop_yield_df.drop(columns=['Item Code', "Element Code", "Element", "Year Code", "Flag"])
In [16]:
display(df_useful['Crops Area harvested'].sample(5))
display(df_useful['Crops Production'].sample(5))
display(df_useful['Crops Seed'].sample(5))
display(df_useful['Crops Yield'].sample(5))
Area Code Area Item Year Unit Value
24109 4 Algeria Garlic 1981 ha 4650.0
382407 214 China, Taiwan Province of Sweet potatoes 1995 ha 10627.0
1342135 171 Philippines Tangerines, mandarins, clementines, satsumas 1970 ha 7080.0
1705499 223 Turkey Chick peas 1995 ha 745000.0
1826059 236 Venezuela (Bolivarian Republic of) Tomatoes 2002 ha 9570.0
Area Code Area Item Year Unit Value
938865 114 Kenya Castor oil seed 1961 tonnes 3000.0
2396303 5500 Oceania Sweet potatoes 1961 tonnes 362096.0
1094431 137 Mauritius Fruit excl Melons,Total 1963 tonnes 6220.0
2176122 5300 Asia Tung nuts 1983 tonnes 368000.0
1196569 153 New Caledonia Plantains and others 2007 tonnes 477.0
Area Code Area Item Year Unit Value
2515031 5802 Land Locked Developing Countries Sweet potatoes 1969 tonnes 1650.0
191364 80 Bosnia and Herzegovina Cow peas, dry 2013 tonnes 6.0
871749 105 Israel Cereals (Rice Milled Eqv) 1964 tonnes 17466.0
2004213 5104 Southern Africa Barley 1974 tonnes 5261.0
1947669 5101 Eastern Africa Cow peas, dry 1989 tonnes 12677.0
Area Code Area Item Year Unit Value
677193 81 Ghana Papayas 2012 hg/ha 33333.0
1660678 176 Timor-Leste Mangoes, mangosteens, guavas 1996 hg/ha 63235.0
985188 121 Lebanon Peaches and nectarines 1979 hg/ha 285714.0
168718 18 Bhutan Chillies and peppers, green 1981 hg/ha 47222.0
1061594 133 Mali Millet 1999 hg/ha 8784.0
In [17]:
select_Maize = df_useful['Crops Area harvested']['Item']=='Maize'
maize_df = df_useful['Crops Area harvested'][select_Maize]

select_switzerland = maize_df['Area']=='Switzerland'
select_france = maize_df['Area']=='France'
select_austria = maize_df['Area']=='Austria'
select_canada = maize_df['Area']=='Canada'
ax = maize_df[select_switzerland].plot(x ='Year', y='Value', kind = 'line')
ax = maize_df[select_france].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = maize_df[select_austria].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = maize_df[select_canada].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax.legend(["Switzerland", 'France', 'Austria', "Canada"])
Out[17]:
<matplotlib.legend.Legend at 0x2022aa18a08>
In [18]:
select_USSR = maize_df['Area']=='USSR'
select_russia = maize_df['Area']=='Russian Federation'
select_ukraine = maize_df['Area']=='Ukraine'
ax = maize_df[select_USSR].plot(x ='Year', y='Value', kind = 'line')
ax = maize_df[select_russia].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = maize_df[select_ukraine].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax.legend(["USSR", 'Russia', 'Ukraine'])
Out[18]:
<matplotlib.legend.Legend at 0x2022ac16488>
1.D.a.iii. Extracting stocks production from the "Livestock production" dataset
In [19]:
selection_stocks = df['Livestock production']["Element"] == 'Stocks'

df_useful['Livestock production'] = df['Livestock production'][selection_stocks].drop(columns=['Item Code', "Element Code", "Element", "Year Code", "Flag"])
In [20]:
display(df_useful['Livestock production'].sample(5))
Area Code Area Item Year Unit Value
74228 137 Mauritius Ducks 1962 1000 Head 20.0
94088 179 Qatar Chickens 2004 1000 Head 4500.0
105662 25 Solomon Islands Poultry Birds 1987 1000 Head 140.0
103645 196 Seychelles Sheep and Goats 2008 Head 5200.0
18297 29 Burundi Sheep 1998 Head 123220.0
In [21]:
select_pigs = df_useful['Livestock production']['Item']=='Pigs'
pigs_df = df_useful['Livestock production'][select_pigs]

select_switzerland = pigs_df['Area']=='Switzerland'
select_france = pigs_df['Area']=='France'
select_austria = pigs_df['Area']=='Austria'
select_canada = pigs_df['Area']=='Canada'
ax = pigs_df[select_switzerland].plot(x ='Year', y='Value', kind = 'line')
ax = pigs_df[select_france].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = pigs_df[select_austria].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = pigs_df[select_canada].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax.legend(["Switzerland", 'France', 'Austria', "Canada"])
Out[21]:
<matplotlib.legend.Legend at 0x20239fecd48>
In [22]:
select_USSR = pigs_df['Area']=='USSR'
select_russia = pigs_df['Area']=='Russian Federation'
select_ukraine = pigs_df['Area']=='Ukraine'
ax = pigs_df[select_USSR].plot(x ='Year', y='Value', kind = 'line')
ax = pigs_df[select_russia].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = pigs_df[select_ukraine].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax.legend(["USSR", 'Russia', 'Ukraine'])
Out[22]:
<matplotlib.legend.Legend at 0x2022a451e88>
1.D.a.iv. Extracting import and export quantities from the "Live animals trade" and "Crops trade" datasets
In [23]:
selection_import_quantities = df['Live animals trade']["Element"] == 'Import Quantity'
selection_export_quantities = df['Live animals trade']["Element"] == 'Export Quantity'

df_useful['Live animals import quantities'] = df['Live animals trade'][selection_import_quantities].drop(columns=['Item Code', "Element Code", "Element", "Year Code", "Flag"])
df_useful['Live animals export quantities'] = df['Live animals trade'][selection_export_quantities].drop(columns=['Item Code', "Element Code", "Element", "Year Code", "Flag"])
In [24]:
display(df_useful['Live animals import quantities'].sample(5))
Area Code Area Item Year Unit Value
122635 48 Costa Rica Turkeys 2008 1000 Head 4.0
97264 351 China Buffaloes 1971 Head 3735.0
269022 123 Liberia Cattle 1983 Head 16377.0
258820 118 Kuwait Sheep and Goats 1991 Head 809207.0
137287 250 Democratic Republic of the Congo Cattle 1980 Head 90.0
In [25]:
select_pigs = df_useful['Live animals import quantities']['Item']=='Pigs'
pigs_df = df_useful['Live animals import quantities'][select_pigs]

select_switzerland = pigs_df['Area']=='Switzerland'
select_france = pigs_df['Area']=='France'
select_austria = pigs_df['Area']=='Austria'
select_canada = pigs_df['Area']=='Canada'
ax = pigs_df[select_switzerland].plot(x ='Year', y='Value', kind = 'line')
ax = pigs_df[select_france].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = pigs_df[select_austria].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = pigs_df[select_canada].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax.legend(["Switzerland", 'France', 'Austria', "Canada"])
Out[25]:
<matplotlib.legend.Legend at 0x2022a7369c8>
In [26]:
select_USSR = pigs_df['Area']=='USSR'
select_russia = pigs_df['Area']=='Russian Federation'
select_ukraine = pigs_df['Area']=='Ukraine'
ax = pigs_df[select_USSR].plot(x ='Year', y='Value', kind = 'line')
ax = pigs_df[select_russia].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = pigs_df[select_ukraine].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax.legend(["USSR", 'Russia', 'Ukraine'])
Out[26]:
<matplotlib.legend.Legend at 0x2022a53d688>
In [27]:
display(df_useful['Live animals export quantities'].sample(5))
Area Code Area Item Year Unit Value
244543 110 Japan Goats 2001 Head NaN
630434 5502 Melanesia Animals live nes 1998 Head 0.0
505884 249 Yemen Chickens 1999 1000 Head 0.0
656297 5815 Low Income Food Deficit Countries Goats 1978 Head 1480038.0
349156 165 Pakistan Horses 1979 Head 7.0
In [28]:
select_pigs = df_useful['Live animals export quantities']['Item']=='Pigs'
pigs_df = df_useful['Live animals export quantities'][select_pigs]

select_switzerland = pigs_df['Area']=='Switzerland'
select_france = pigs_df['Area']=='France'
select_austria = pigs_df['Area']=='Austria'
select_canada = pigs_df['Area']=='Canada'
ax = pigs_df[select_switzerland].plot(x ='Year', y='Value', kind = 'line')
ax = pigs_df[select_france].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = pigs_df[select_austria].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = pigs_df[select_canada].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax.legend(["Switzerland", 'France', 'Austria', "Canada"])
Out[28]:
<matplotlib.legend.Legend at 0x202297b2f48>
In [29]:
select_USSR = pigs_df['Area']=='USSR'
select_russia = pigs_df['Area']=='Russian Federation'
select_ukraine = pigs_df['Area']=='Ukraine'
ax = pigs_df[select_USSR].plot(x ='Year', y='Value', kind = 'line')
ax = pigs_df[select_russia].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = pigs_df[select_ukraine].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax.legend(["USSR", 'Russia', 'Ukraine'])
Out[29]:
<matplotlib.legend.Legend at 0x2022a220a88>
In [30]:
selection_import_quantities = df['Crops trade']["Element"] == 'Import Quantity'
selection_export_quantities = df['Crops trade']["Element"] == 'Export Quantity'

df_useful['Crops import quantities'] = df['Crops trade'][selection_import_quantities].drop(columns=['Item Code', "Element Code", "Element", "Year Code", "Flag"])
df_useful['Crops export quantities'] = df['Crops trade'][selection_export_quantities].drop(columns=['Item Code', "Element Code", "Element", "Year Code", "Flag"])
In [31]:
display(df_useful['Crops import quantities'].sample(5))
Area Code Area Item Year Unit Value
6464411 134 Malta Flour, mustard 2002 tonnes 4.0
2655928 46 Congo Tangerines, mandarins, clementines, satsumas 1989 tonnes 0.0
8613360 185 Russian Federation Potatoes 1997 tonnes 119192.0
2224009 96 China, Hong Kong SAR Flax fibre raw 1967 tonnes 0.0
9957759 211 Switzerland Rubber, natural 2006 tonnes 1955.0
In [32]:
select_Maize = df_useful['Crops import quantities']['Item']=='Maize'
maize_df = df_useful['Crops import quantities'][select_Maize]

select_switzerland = maize_df['Area']=='Switzerland'
select_france = maize_df['Area']=='France'
select_austria = maize_df['Area']=='Austria'
select_canada = maize_df['Area']=='Canada'
ax = maize_df[select_switzerland].plot(x ='Year', y='Value', kind = 'line')
ax = maize_df[select_france].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = maize_df[select_austria].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = maize_df[select_canada].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax.legend(["Switzerland", 'France', 'Austria', "Canada"])
Out[32]:
<matplotlib.legend.Legend at 0x20229f31dc8>
In [33]:
select_USSR = maize_df['Area']=='USSR'
select_russia = maize_df['Area']=='Russian Federation'
select_ukraine = maize_df['Area']=='Ukraine'
ax = maize_df[select_USSR].plot(x ='Year', y='Value', kind = 'line')
ax = maize_df[select_russia].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = maize_df[select_ukraine].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax.legend(["USSR", 'Russia', 'Ukraine'])
Out[33]:
<matplotlib.legend.Legend at 0x2022a1293c8>
In [34]:
display(df_useful['Crops export quantities'].sample(5))
Area Code Area Item Year Unit Value
1933432 33 Canada Meat, cattle, boneless (beef & veal) 1989 tonnes 51240.0
6036230 123 Liberia Milk Dry 2012 tonnes 0.0
11096178 235 Uzbekistan Poppy seed 1995 tonnes NaN
3431484 59 Egypt Coffee, extracts 1967 tonnes 0.0
12533720 5204 Central America Flour, mustard 1978 tonnes 0.0
In [35]:
select_Maize = df_useful['Crops export quantities']['Item']=='Maize'
maize_df = df_useful['Crops export quantities'][select_Maize]

select_switzerland = maize_df['Area']=='Switzerland'
select_france = maize_df['Area']=='France'
select_austria = maize_df['Area']=='Austria'
select_canada = maize_df['Area']=='Canada'
ax = maize_df[select_switzerland].plot(x ='Year', y='Value', kind = 'line')
ax = maize_df[select_france].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = maize_df[select_austria].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = maize_df[select_canada].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax.legend(["Switzerland", 'France', 'Austria', "Canada"])
Out[35]:
<matplotlib.legend.Legend at 0x2022a5ed148>
In [36]:
select_USSR = maize_df['Area']=='USSR'
select_russia = maize_df['Area']=='Russian Federation'
select_ukraine = maize_df['Area']=='Ukraine'
ax = maize_df[select_USSR].plot(x ='Year', y='Value', kind = 'line')
ax = maize_df[select_russia].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = maize_df[select_ukraine].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax.legend(["USSR", 'Russia', 'Ukraine'])
Out[36]:
<matplotlib.legend.Legend at 0x2022a560ec8>
1.D.a.v. Extracting average CPI of each year from the "Consumer price indices" dataset
In [37]:
df_useful['Consumer price indices'] =  df['Consumer price indices'][['Area',"Year",'Value']] \
                                        .dropna() \
                                        .groupby(['Area',"Year"]) \
                                        .mean() \
                                        .reset_index() \
                                        .dropna()
In [38]:
display(df_useful['Consumer price indices'].sample(5))
Area Year Value
1531 Japan 2000 97.492035
2489 Romania 2010 164.208823
1122 Gabon 2007 91.897597
1388 Iceland 2011 123.361873
3192 United Kingdom 2006 81.338998
In [39]:
select_switzerland = df_useful['Consumer price indices']['Area']=='Switzerland'
select_france = df_useful['Consumer price indices']['Area']=='France'
select_austria = df_useful['Consumer price indices']['Area']=='Austria'
select_canada = df_useful['Consumer price indices']['Area']=='Canada'
ax = df_useful['Consumer price indices'][select_switzerland].plot(x ='Year', y='Value', kind = 'line')
ax = df_useful['Consumer price indices'][select_france].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = df_useful['Consumer price indices'][select_austria].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = df_useful['Consumer price indices'][select_canada].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax.legend(["Switzerland", 'France', 'Austria', "Canada"])
Out[39]:
<matplotlib.legend.Legend at 0x2022a415b48>
In [40]:
select_russia = df_useful["Consumer price indices"]['Area']=='Russian Federation'
select_ukraine = df_useful["Consumer price indices"]['Area']=='Ukraine'
ax = df_useful["Consumer price indices"][select_russia].plot(x ='Year', y='Value', kind = 'line')
ax = df_useful["Consumer price indices"][select_ukraine].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax.legend(['Russia', 'Ukraine'])
Out[40]:
<matplotlib.legend.Legend at 0x2022a3f7048>
1.D.a.vi. Removing areas which are not countries

Having a more detailled look at the dataset, we have remarked that the areas which are real countries are exactely the ones with an "Area Code" below 5000.

In [41]:
#remove Area code >= 5000
for df_name in df_useful :
    if 'Area Code' in df_useful[df_name].keys() : 
        print ("Removing areas which are not countries in", df_name)
        selection_countries = df_useful[df_name]['Area Code']<5000
        df_useful[df_name] = df_useful[df_name][selection_countries]
        display(df_useful[df_name].sample(5))
    else :
        print (df_name, "is already clean")
Removing areas which are not countries in GDP
Area Code Area Year Value
108927 45 Comoros 1978 160.644161
127763 50 Cyprus 2011 27428.867588
368534 170 Peru 1970 5829.069768
410810 194 Saudi Arabia 1972 9664.176131
207048 93 Haiti 2012 7820.337702
Removing areas which are not countries in Crops Area harvested
Area Code Area Item Year Unit Value
418939 47 Cook Islands Citrus Fruit,Total 1985 ha 116.0
937498 114 Kenya Barley 1966 ha 8943.0
98516 11 Austria Vegetables&Melons, Total 1985 ha 12777.0
1062788 133 Mali Rice, paddy 1978 ha 111675.0
1862734 251 Zambia Maize 1992 ha 661305.0
Removing areas which are not countries in Crops Production
Area Code Area Item Year Unit Value
41829 7 Angola Tomatoes 2000 tonnes 13000.0
831626 102 Iran (Islamic Republic of) Onions, dry 1993 tonnes 957404.0
1489516 186 Serbia and Montenegro Raspberries 1995 tonnes 53084.0
725090 89 Guatemala Papayas 2002 tonnes 41000.0
572078 60 El Salvador Mangoes, mangosteens, guavas 1993 tonnes 17700.0
Removing areas which are not countries in Crops Seed
Area Code Area Item Year Unit Value
757377 93 Haiti Cow peas, dry 1966 tonnes 2340.0
203986 21 Brazil Groundnuts, with shell 1978 tonnes 11547.0
589588 238 Ethiopia Maize 2013 tonnes 63446.0
883648 106 Italy Peas, dry 1986 tonnes 3155.0
550334 58 Ecuador Wheat 1999 tonnes 3421.0
Removing areas which are not countries in Crops Yield
Area Code Area Item Year Unit Value
538674 58 Ecuador Broad beans, horse beans, dry 1972 hg/ha 7149.0
914477 110 Japan Taro (cocoyam) 1964 hg/ha 124810.0
1491850 196 Seychelles Fruit, tropical fresh nes 2002 hg/ha 53425.0
72115 10 Australia Cauliflowers and broccoli 1995 hg/ha 103887.0
1234512 159 Nigeria Millet 2012 hg/ha 9642.0
Removing areas which are not countries in Livestock production
Area Code Area Item Year Unit Value
11284 53 Benin Sheep 1999 Head 653530.0
7540 13 Bahrain Camels 1981 Head 730.0
103824 197 Sierra Leone Goats 1971 Head 105000.0
88279 166 Panama Cattle 2012 Head 1722500.0
18986 35 Cabo Verde Cattle and Buffaloes 1985 Head 10000.0
Removing areas which are not countries in Live animals import quantities
Area Code Area Item Year Unit Value
191183 84 Greece Camels 2012 Head NaN
467047 222 Tunisia Pigs 1996 Head 0.0
159434 60 El Salvador Sheep 1973 Head 0.0
489817 231 United States of America Beehives 2000 No NaN
483790 229 United Kingdom Goats 1962 Head 0.0
Removing areas which are not countries in Live animals export quantities
Area Code Area Item Year Unit Value
120686 48 Costa Rica Beehives 1967 No NaN
487194 215 United Republic of Tanzania Goats 1974 Head NaN
1199 3 Albania Animals live nes 1994 Head NaN
21208 10 Australia Goats 1991 Head 72190.0
84967 33 Canada Chickens 1970 1000 Head 2919.0
Removing areas which are not countries in Crops import quantities
Area Code Area Item Year Unit Value
1591180 27 Bulgaria Walnuts, shelled 2011 tonnes 297.0
3571560 61 Equatorial Guinea Sugar refined 1964 tonnes 700.0
8564405 183 Romania Rapeseed 2001 tonnes 161.0
4551993 90 Guinea Textile Fibres 2006 tonnes 313.0
3116915 250 Democratic Republic of the Congo Spices, nes 2006 tonnes 88.0
Removing areas which are not countries in Crops export quantities
Area Code Area Item Year Unit Value
1605123 27 Bulgaria Rape+Mustard Seed 1962 tonnes 4860.0
1887510 32 Cameroon Eggs Liquid,Dried 1965 tonnes 0.0
8416272 117 Republic of Korea Fat, nes, prepared 1967 tonnes 0.0
9268691 199 Slovakia Fruit, dried nes 2010 tonnes 1227.0
2465578 214 China, Taiwan Province of Juice, grapefruit, concentrated 1962 tonnes NaN
Consumer price indices is already clean

1.D.b. Handling of the missing data

In this section, we will explain how we will handle the missing data in previous dataframes for maps.

1.D.b.i. Highlighting the problem
In [42]:
select_USSR = df_useful["GDP"]['Area']=='USSR'
select_russia = df_useful["GDP"]['Area']=='Russian Federation'
select_ukraine = df_useful["GDP"]['Area']=='Ukraine'
ax = df_useful["GDP"][select_USSR].plot(x ='Year', y='Value', kind = 'line')
ax = df_useful["GDP"][select_russia].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax = df_useful["GDP"][select_ukraine].plot(x ='Year', y='Value', kind = 'line', ax = ax)
ax.legend(["USSR", 'Russia', 'Ukraine'])
Out[42]:
<matplotlib.legend.Legend at 0x2022a000f88>

In order to vizualize folium maps, we need to associate each country a value. The geojson file that we use is not timestamped and only countries that exist nowadays are inside it. As some countries has been dissolved during the past 50 years, our folium maps won't be complete. For instance, we do not have any value for Ukraine from 1970 to 1989. Our idea to fix this issue is presented in the next paragraph.

1.D.b.ii. Proposed correction

Our idea is to map the former country value to each of the current ones. For instance in 1982, USSR GDP is around one trillion $. Therefore, if we associate (only for folium map purposes) this value to each current country that succeeded USSR, all these countries will appear the same color in the folium map, i.e. all the USSR area will appear the same color (and the good one).

In order to do so, one need to identify which countries appeared and disappeared from the dataset and at which year. Then we will use this result along with some historical research in our visualise_world_data_folium function (1Ea).

In [43]:
countries_formation_years = {}
for country in df_useful["GDP"]["Area"].unique():
    selection = df_useful["GDP"]["Area"] == country
    year_in, year_out = df_useful["GDP"][selection].dropna()["Year"].min(), df_useful["GDP"][selection].dropna()["Year"].max()
    for year in (year_in, year_out):
        if year not in countries_formation_years :
            countries_formation_years[year] = []
    countries_formation_years[year_in].append((country,'+'))
    countries_formation_years[year_out].append((country,'-'))

countries_formation_years.pop(1970)
countries_formation_years.pop(2015)
for year in sorted(list(countries_formation_years)):
    print (year, countries_formation_years[year])
1988 [('Yemen Ar Rp', '-'), ('Yemen Dem', '-')]
1989 [('Czechoslovakia', '-'), ('Ethiopia PDR', '-'), ('USSR', '-'), ('Yemen', '+'), ('Yugoslav SFR', '-')]
1990 [('Armenia', '+'), ('Azerbaijan', '+'), ('Belarus', '+'), ('Bosnia and Herzegovina', '+'), ('Croatia', '+'), ('Czechia', '+'), ('Eritrea', '+'), ('Estonia', '+'), ('Ethiopia', '+'), ('Georgia', '+'), ('Kazakhstan', '+'), ('Kyrgyzstan', '+'), ('Latvia', '+'), ('Lithuania', '+'), ('Montenegro', '+'), ('Republic of Moldova', '+'), ('Russian Federation', '+'), ('Serbia', '+'), ('Slovakia', '+'), ('Slovenia', '+'), ('Tajikistan', '+'), ('The former Yugoslav Republic of Macedonia', '+'), ('Timor-Leste', '+'), ('Turkmenistan', '+'), ('Ukraine', '+'), ('Uzbekistan', '+')]
1999 [('Kosovo', '+')]
2005 [('Curaçao', '+'), ('Sint Maarten (Dutch Part)', '+')]
2007 [('Sudan (former)', '-')]
2008 [('South Sudan', '+'), ('Sudan', '+')]
2012 [('Netherlands Antilles (former)', '-')]

1.E. Preprocessing the data set

In this part, we will finish prepocessing the datasets. More precisely, we will deal with country names and normalizing the features.

1.E.a. Converting country names between different naming conventions

In [44]:
dic = {'Czechia': "Czech Republic",
       'Russian Federation':'Russia',
       "Serbia":"Republic of Serbia",
       'The former Yugoslav Republic of Macedonia':'Macedonia',
       'China, mainland':'China',
       'Viet Nam':'Vietnam',
       'Venezuela (Bolivarian Republic of)':'Venezuela',
       'Iran (Islamic Republic of)':'Iran',
       'Syrian Arab Republic':"Syria",
       'Bolivia (Plurinational State of)': 'Bolivia',
       "Côte d'Ivoire": "Ivory Coast",
       'Congo':"Republic of the Congo",
       "Lao People's Democratic Republic":'Laos',
       "Democratic People's Republic of Korea":"North Korea",
       'Republic of Korea':"South Korea"}

def correct_country_names(old_name):
    if old_name in dic.keys() :
        return dic[old_name]
    return old_name
In [45]:
for df_name in df_useful :
    print (df_name)
    df_useful[df_name]["Area"] = df_useful[df_name]["Area"].apply(correct_country_names)
GDP
Crops Area harvested
Crops Production
Crops Seed
Crops Yield
Livestock production
Live animals import quantities
Live animals export quantities
Crops import quantities
Crops export quantities
Consumer price indices
In [46]:
def visualise_world_data_folium(df, year, logScale=True):
    dic = {'USSR':                            ['Armenia', 'Azerbaijan','Belarus', 'Estonia', 'Georgia',
                                               'Kazakhstan', 'Kyrgyzstan', 'Latvia', 'Lithuania',
                                               'Montenegro', 'Republic of Moldova', 'Russia',
                                               'Republic of Serbia', 'Timor-Leste', 'Turkmenistan', 'Ukraine',
                                               'Uzbekistan'],
           'Ethiopia PDR':                     ['Eritrea','Ethiopia'],
           'Yugoslav SFR':                     ['Kosovo', 'Slovenia', 'Croatia',
                                                'Macedonia', 'Bosnia and Herzegovina'],
           'Yemen Dem' :                       ['Yemen'],        
           'Czechoslovakia':                   ["Czech Republic", 'Slovakia'],
           'Netherlands Antilles (former)':    ['Curaçao', 'Sint Maarten (Dutch Part)'],
           'Sudan (former)':                   ['South Sudan', 'Sudan']
          }
    def add_new_names(old_name):
        if old_name in dic.keys() :
            return dic[old_name]
        return old_name
    to_plot=df[df["Year"]==year]
    to_plot=(to_plot[['Area','Value']]
             .dropna()
             .groupby('Area')             
             .mean()
             .reset_index()
             .dropna())    
    to_plot['Area']=to_plot['Area'].apply(add_new_names)
    to_plot = to_plot.explode('Area')
    if logScale :
        to_plot.Value=np.log10(to_plot.Value)
    
    m = folium.Map(location=[40,-10],zoom_start=1.6)
    folium.Choropleth(
        geo_data=f"https://raw.githubusercontent.com/python-visualization/folium/master/examples/data/world-countries.json",
        data=to_plot,
        columns=['Area', 'Value'],
        key_on='feature.properties.name',
        fill_color='YlGn',fill_opacity=0.7,line_opacity=0.2,nan_fill_opacity=0.0
    ).add_to(m)

    folium.LayerControl().add_to(m)

    return(m)
In [47]:
display(visualise_world_data_folium(df_useful["GDP"], 1985, True))

1.E.b. Normalization and log scales

TODO, explain why (heavy tail, right skewed, power laws) + do it

For instance the distribution of GDP look a bit like a power law.

In [48]:
sns.distplot(df_useful["GDP"]["Value"], rug=False, hist=False)
C:\Users\Martin\.conda\envs\ada\lib\site-packages\statsmodels\nonparametric\kde.py:447: RuntimeWarning: invalid value encountered in greater
  X = X[np.logical_and(X > clip[0], X < clip[1])] # won't work for two columns.
C:\Users\Martin\.conda\envs\ada\lib\site-packages\statsmodels\nonparametric\kde.py:447: RuntimeWarning: invalid value encountered in less
  X = X[np.logical_and(X > clip[0], X < clip[1])] # won't work for two columns.
Out[48]:
<matplotlib.axes._subplots.AxesSubplot at 0x20229f05108>
In [49]:
#looks better with log scale
sns.distplot(np.log(df_useful["GDP"]["Value"]), rug=False, hist=False)
Out[49]:
<matplotlib.axes._subplots.AxesSubplot at 0x20229fcd748>

1.F. Making one uniformized dataframe

In this part, we will make one uniformized dataframe uni_df with the following columns.

Country | Year | GDP | Crops production columns | Livestock production columns | Crops importation columns | Livestock importation columns | Crops exportation columns | Livestock exportation | CPI

In this uniformized dataframe, a tuple (Country, Year) uniquely identifies a row.

1.F.a. Pivoting dataframes with items

In [50]:
need_pivot = ['Crops Area harvested',
              'Crops Production',
              'Crops Seed',
              'Crops Yield',
              'Livestock production',
              'Live animals import quantities',
              'Live animals export quantities',
              'Crops import quantities',
              'Crops export quantities']

def rename_columns(x, word):
    if x not in ['Area', 'Year', 'ha', 'tonnes', 'hg/ha', 'Head', '1000 Head']:
        return x + ' ' + word
    return x

df_useful['GDP'] = df_useful['GDP'].rename(columns = {'Value':'(GDP, million $)'})[["Area",'Year','(GDP, million $)']]
df_useful['Consumer price indices'] = df_useful['Consumer price indices'].rename(columns = {'Value':'(Consumer price indices, %)'})[["Area",'Year','(Consumer price indices, %)']]

for df_name in need_pivot :
    df_useful[df_name] = pd.pivot_table(df_useful[df_name], index=["Area",'Year'], columns=["Item","Unit"], values="Value").rename(columns=lambda x: rename_columns(x, df_name))
    display(df_useful[df_name].sample(5))
Item Anise, badian, fennel, coriander Crops Area harvested Apples Crops Area harvested Apricots Crops Area harvested Areca nuts Crops Area harvested Artichokes Crops Area harvested Asparagus Crops Area harvested Avocados Crops Area harvested Bambara beans Crops Area harvested Bananas Crops Area harvested Barley Crops Area harvested ... Sweet potatoes Crops Area harvested Tangerines, mandarins, clementines, satsumas Crops Area harvested Taro (cocoyam) Crops Area harvested Tomatoes Crops Area harvested Tung nuts Crops Area harvested Vegetables&Melons, Total Crops Area harvested Vetches Crops Area harvested Watermelons Crops Area harvested Wheat Crops Area harvested Yams Crops Area harvested
Unit ha ha ha ha ha ha ha ha ha ha ... ha ha ha ha ha ha ha ha ha ha
Area Year
Marshall Islands 2013 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
Portugal 1999 NaN 24300.0 657.0 NaN NaN NaN 12484.0 NaN 1324.0 24634.0 ... 2718.0 3969.0 NaN 17281.0 NaN 82418.0 NaN 204.0 220464.0 132.0
Cayman Islands 1983 NaN NaN NaN NaN NaN NaN NaN NaN 10.0 NaN ... 1.0 NaN NaN 1.0 NaN 4.0 NaN NaN NaN 3.0
Somalia 2006 NaN NaN NaN NaN NaN NaN NaN NaN 2111.0 NaN ... 750.0 NaN NaN 12000.0 NaN 23844.0 NaN 720.0 2600.0 NaN
Albania 2000 NaN 2300.0 403.0 NaN NaN NaN NaN NaN NaN 1200.0 ... NaN NaN NaN 5400.0 NaN 28090.0 6200.0 8300.0 112000.0 NaN

5 rows × 120 columns

Item Anise, badian, fennel, coriander Crops Production Apples Crops Production Apricots Crops Production Areca nuts Crops Production Artichokes Crops Production Asparagus Crops Production Avocados Crops Production Bambara beans Crops Production Bananas Crops Production Barley Crops Production ... Sweet potatoes Crops Production Tangerines, mandarins, clementines, satsumas Crops Production Taro (cocoyam) Crops Production Tomatoes Crops Production Tung nuts Crops Production Vegetables&Melons, Total Crops Production Vetches Crops Production Watermelons Crops Production Wheat Crops Production Yams Crops Production
Unit tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes ... tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes
Area Year
Faroe Islands 1984 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
Indonesia 1968 NaN NaN NaN 14000.0 NaN NaN 50000.0 NaN 920000.0 NaN ... 2364300.0 NaN NaN 15100.0 NaN 1965425.0 NaN NaN NaN NaN
Gambia 1986 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN 7400.0 NaN NaN NaN NaN
Republic of Serbia 2008 716.0 235601.0 22301.0 NaN NaN NaN NaN NaN NaN 344141.0 ... NaN NaN NaN 176501.0 NaN 1249177.0 28939.0 255984.0 2095400.0 NaN
Costa Rica 2012 NaN NaN NaN NaN 32.0 127.0 26960.0 NaN 2136437.0 NaN ... 300.0 NaN NaN 52556.0 NaN 377242.0 NaN 61628.0 NaN 29580.0

5 rows × 122 columns

Item Anise, badian, fennel, coriander Crops Seed Bambara beans Crops Seed Bananas Crops Seed Barley Crops Seed Beans, dry Crops Seed Broad beans, horse beans, dry Crops Seed Buckwheat Crops Seed Cabbages and other brassicas Crops Seed Carrots and turnips Crops Seed Cassava Crops Seed ... Sorghum Crops Seed Soybeans Crops Seed Sugar cane Crops Seed Sweet potatoes Crops Seed Taro (cocoyam) Crops Seed Vegetables&Melons, Total Crops Seed Vetches Crops Seed Watermelons Crops Seed Wheat Crops Seed Yams Crops Seed
Unit tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes ... tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes
Area Year
Australia 1967 NaN NaN NaN 94000.0 182.0 10.0 NaN NaN NaN NaN ... 1495.0 65.0 NaN NaN NaN NaN 16.0 NaN 668000.0 NaN
China, Hong Kong SAR 2008 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
Haiti 1988 NaN NaN NaN NaN 3360.0 NaN NaN NaN NaN NaN ... 2970.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN
Ethiopia PDR 1985 NaN NaN NaN 69497.0 2277.0 22415.0 NaN NaN NaN NaN ... 17161.0 27.0 NaN NaN NaN NaN 5237.0 NaN 54485.0 24500.0
Nigeria 1965 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... 92860.0 5180.0 9040.0 NaN NaN NaN NaN NaN 550.0 NaN

5 rows × 52 columns

Item Anise, badian, fennel, coriander Crops Yield Apples Crops Yield Apricots Crops Yield Areca nuts Crops Yield Artichokes Crops Yield Asparagus Crops Yield Avocados Crops Yield Bambara beans Crops Yield Bananas Crops Yield Barley Crops Yield ... Sweet potatoes Crops Yield Tangerines, mandarins, clementines, satsumas Crops Yield Taro (cocoyam) Crops Yield Tomatoes Crops Yield Tung nuts Crops Yield Vegetables&Melons, Total Crops Yield Vetches Crops Yield Watermelons Crops Yield Wheat Crops Yield Yams Crops Yield
Unit hg/ha hg/ha hg/ha hg/ha hg/ha hg/ha hg/ha hg/ha hg/ha hg/ha ... hg/ha hg/ha hg/ha hg/ha hg/ha hg/ha hg/ha hg/ha hg/ha hg/ha
Area Year
Senegal 1977 NaN NaN NaN NaN NaN NaN NaN NaN 142857.0 NaN ... 53597.0 NaN NaN 171490.0 NaN 254777.0 NaN NaN NaN NaN
French Guiana 1996 NaN NaN NaN NaN NaN NaN 36866.0 NaN 71920.0 NaN ... NaN 16190.0 53882.0 291298.0 NaN 165377.0 NaN NaN NaN NaN
Lebanon 2009 18006.0 117034.0 49153.0 NaN 93750.0 NaN 118033.0 NaN 288722.0 20625.0 ... NaN 196825.0 196548.0 627824.0 NaN 278771.0 13000.0 327980.0 27990.0 NaN
Australia 2000 6689.0 162260.0 64113.0 NaN NaN 54735.0 43141.0 NaN 191178.0 19522.0 ... 317647.0 184372.0 NaN 497016.0 NaN 227214.0 6154.0 175716.0 18209.0 NaN
Spain 1961 NaN 229464.0 105673.0 NaN 90457.0 57692.0 97143.0 NaN 331895.0 12026.0 ... 130862.0 130000.0 NaN 221915.0 NaN 175808.0 6513.0 143046.0 8837.0 NaN

5 rows × 120 columns

Item Animals live nes Livestock production Asses Livestock production Beehives Livestock production Buffaloes Livestock production Camelids, other Livestock production Camels Livestock production Cattle Livestock production Cattle and Buffaloes Livestock production Chickens Livestock production Ducks Livestock production ... Horses Livestock production Mules Livestock production Pigeons, other birds Livestock production Pigs Livestock production Poultry Birds Livestock production Rabbits and hares Livestock production Rodents, other Livestock production Sheep Livestock production Sheep and Goats Livestock production Turkeys Livestock production
Unit Head Head No Livestock production Head Head Head Head Head 1000 Head 1000 Head ... Head Head 1000 Head Head 1000 Head 1000 Head 1000 Head Head Head 1000 Head
Area Year
India 1981 NaN 1000000.0 9000000.0 67500000.0 NaN 1050000.0 188700000.0 256200000.0 197000.0 16100.0 ... 900000.0 119000.0 NaN 9600000.0 213100.0 NaN NaN 46420000.0 137420000.0 NaN
Argentina 1996 NaN 90000.0 1700000.0 NaN NaN NaN 50829700.0 50829700.0 89000.0 2200.0 ... 3300000.0 175000.0 NaN 3100000.0 94020.0 1060.0 NaN 14308000.0 17682600.0 2700.0
Morocco 1990 NaN 911892.0 546000.0 NaN NaN 33775.0 3346258.0 3346258.0 73000.0 NaN ... 193545.0 522831.0 NaN 8900.0 76100.0 NaN NaN 13514426.0 18849519.0 3100.0
South Africa 1986 NaN 210000.0 52000.0 NaN NaN NaN 12000000.0 12000000.0 38000.0 230.0 ... 230000.0 14000.0 NaN 1366000.0 38637.0 NaN NaN 29481008.0 35281008.0 295.0
Ireland 1987 NaN 16000.0 NaN NaN NaN NaN 5670300.0 5670300.0 7340.0 109.0 ... 54500.0 1300.0 NaN 980000.0 8222.0 NaN NaN 3671800.0 3671800.0 717.0

5 rows × 22 columns

Item Animals live nes Live animals import quantities Asses Live animals import quantities Beehives Live animals import quantities Bovine, Animals Live animals import quantities Buffaloes Live animals import quantities Camelids, other Live animals import quantities Camels Live animals import quantities Cattle Live animals import quantities Chickens Live animals import quantities ... Mules Live animals import quantities Pigeons, other birds Live animals import quantities Pigs Live animals import quantities Rabbits and hares Live animals import quantities Rodents, other Live animals import quantities Sheep Live animals import quantities Sheep and Goats Live animals import quantities Turkeys Live animals import quantities
Unit Head Head No Live animals import quantities Head Head Head Head Head 1000 Head Head ... Head 1000 Head Head Head 1000 Head 1000 Head Head Head 1000 Head Head
Area Year
Poland 2010 NaN 0.0 NaN 19778.0 NaN NaN NaN 19778.0 49375.0 NaN ... 0.0 0.0 NaN 2003636.0 0.0 NaN 1436.0 1436.0 11701.0 NaN
Ecuador 1967 NaN 0.0 NaN 131.0 NaN NaN NaN 131.0 309.0 NaN ... NaN NaN NaN 0.0 NaN NaN 0.0 0.0 NaN NaN
Vietnam 2010 NaN NaN NaN 10464.0 NaN NaN NaN 10464.0 1201.0 NaN ... NaN NaN NaN 796.0 NaN NaN NaN NaN NaN NaN
Cameroon 2004 NaN NaN NaN 47500.0 NaN NaN NaN 47500.0 144.0 NaN ... NaN NaN NaN 0.0 NaN NaN 4.0 4.0 0.0 NaN
Cuba 2010 NaN NaN NaN 0.0 NaN NaN NaN 0.0 0.0 NaN ... NaN NaN NaN 0.0 NaN NaN 0.0 0.0 0.0 NaN

5 rows × 24 columns

Item Animals live nes Live animals export quantities Asses Live animals export quantities Beehives Live animals export quantities Bovine, Animals Live animals export quantities Buffaloes Live animals export quantities Camelids, other Live animals export quantities Camels Live animals export quantities Cattle Live animals export quantities Chickens Live animals export quantities ... Mules Live animals export quantities Pigeons, other birds Live animals export quantities Pigs Live animals export quantities Rabbits and hares Live animals export quantities Rodents, other Live animals export quantities Sheep Live animals export quantities Sheep and Goats Live animals export quantities Turkeys Live animals export quantities
Unit Head Head No Live animals export quantities Head Head Head Head Head 1000 Head Head ... Head 1000 Head Head Head 1000 Head 1000 Head Head Head 1000 Head Head
Area Year
Mauritania 1989 NaN NaN NaN 50000.0 NaN NaN 10000.0 50000.0 NaN NaN ... NaN NaN NaN NaN NaN NaN 250000.0 400000.0 NaN NaN
Costa Rica 1973 NaN 0.0 NaN 8752.0 NaN NaN NaN 8752.0 60.0 NaN ... NaN NaN NaN 6.0 NaN NaN 0.0 0.0 NaN NaN
Swaziland 2011 NaN 0.0 NaN 1100.0 NaN NaN NaN 1100.0 0.0 NaN ... 0.0 NaN NaN 0.0 NaN NaN 0.0 176.0 0.0 NaN
Belize 2004 NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN ... NaN NaN NaN 204.0 NaN NaN NaN NaN NaN NaN
Ecuador 1980 NaN NaN NaN 0.0 NaN NaN NaN 0.0 0.0 NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

5 rows × 24 columns

Item Alfalfa meal and pellets Crops import quantities Almonds shelled Crops import quantities Animal Oil+Fat+Grs Crops import quantities Animal Vegetable Oil Crops import quantities Animal fats Crops import quantities Anise, badian, fennel, coriander Crops import quantities Apples Crops import quantities Apricots Crops import quantities Apricots, dry Crops import quantities Artichokes Crops import quantities ... Wheat+Flour,Wheat Equivalent Crops import quantities Whey, Pres+Concen Crops import quantities Whey, condensed Crops import quantities Whey, dry Crops import quantities Wine Crops import quantities Wine+Vermouth+Sim. Crops import quantities Wool, degreased Crops import quantities Wool, greasy Crops import quantities Wool, hair waste Crops import quantities Yoghurt, concentrated or not Crops import quantities
Unit tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes ... tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes
Area Year
Barbados 1987 0.0 NaN 39.0 5067.0 39.0 1.0 1010.0 NaN NaN NaN ... 24756.0 5.0 NaN 5.0 740.0 777.0 NaN NaN NaN 0.0
Grenada 1989 NaN NaN 10.0 258.0 10.0 3.0 92.0 NaN NaN NaN ... 10117.0 2.0 NaN 2.0 81.0 82.0 NaN NaN NaN NaN
Slovakia 2011 134.0 1205.0 21325.0 151123.0 21325.0 670.0 49855.0 1380.0 568.0 2.0 ... 193842.0 3680.0 NaN 3680.0 76319.0 82014.0 56.0 3277.0 1251.0 21520.0
Peru 1988 0.0 4.0 257.0 86757.0 257.0 666.0 0.0 NaN 0.0 NaN ... 931783.0 293.0 NaN 293.0 0.0 0.0 0.0 475.0 11.0 0.0
Azerbaijan 2008 NaN 178.0 1646.0 94513.0 1646.0 1.0 2155.0 1.0 48.0 NaN ... 1458837.0 0.0 NaN 0.0 301.0 4112.0 0.0 30.0 NaN 1511.0

5 rows × 454 columns

Item Alfalfa meal and pellets Crops export quantities Almonds shelled Crops export quantities Animal Oil+Fat+Grs Crops export quantities Animal Vegetable Oil Crops export quantities Animal fats Crops export quantities Anise, badian, fennel, coriander Crops export quantities Apples Crops export quantities Apricots Crops export quantities Apricots, dry Crops export quantities Artichokes Crops export quantities ... Wheat+Flour,Wheat Equivalent Crops export quantities Whey, Pres+Concen Crops export quantities Whey, condensed Crops export quantities Whey, dry Crops export quantities Wine Crops export quantities Wine+Vermouth+Sim. Crops export quantities Wool, degreased Crops export quantities Wool, greasy Crops export quantities Wool, hair waste Crops export quantities Yoghurt, concentrated or not Crops export quantities
Unit tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes ... tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes tonnes
Area Year
Iran 1977 NaN NaN 0.0 20.0 0.0 6962.0 663.0 NaN NaN NaN ... 129.0 NaN NaN NaN 14.0 15.0 417.0 0.0 10.0 NaN
Democratic Republic of the Congo 1976 NaN NaN NaN 62357.0 NaN NaN NaN NaN NaN NaN ... 85.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN NaN
Ivory Coast 1994 NaN NaN 0.0 191218.0 0.0 NaN 0.0 NaN NaN NaN ... 0.0 NaN NaN NaN 0.0 0.0 NaN NaN NaN NaN
Malta 1995 NaN 0.0 0.0 0.0 0.0 11.0 0.0 NaN NaN NaN ... 32.0 NaN NaN NaN 35.0 838.0 0.0 0.0 0.0 NaN
Maldives 1983 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

5 rows × 445 columns

In [51]:
# Deal with the NaN that appeared
for df_name in df_useful :
    for column in list(df_useful[df_name]):
        if column not in ['Area', 'Year']:
            df_useful[df_name][column].fillna(df_useful[df_name][column].median(), inplace=True)

1.F.b. Merging everything

In [52]:
uni_df = df_useful['GDP']
for df_name in need_pivot :
    uni_df = pd.merge(uni_df, df_useful[df_name], how='outer', on=['Area', 'Year'])
uni_df = pd.merge(uni_df,df_useful['Consumer price indices'], how='outer', on=['Area', 'Year'])

# Deal with the NaN that appeared
for column in list(uni_df):
    if column not in ['Area', 'Year']:
        uni_df[column].fillna(uni_df[column].median(), inplace=True)
uni_df.sample(30)
C:\Users\Martin\.conda\envs\ada\lib\site-packages\pandas\core\reshape\merge.py:617: UserWarning: merging between different levels can give an unintended result (1 levels on the left, 2 on the right)
  warnings.warn(msg, UserWarning)
Out[52]:
Area Year (GDP, million $) (Anise, badian, fennel, coriander Crops Area harvested, ha) (Apples Crops Area harvested, ha) (Apricots Crops Area harvested, ha) (Areca nuts Crops Area harvested, ha) (Artichokes Crops Area harvested, ha) (Asparagus Crops Area harvested, ha) (Avocados Crops Area harvested, ha) ... (Whey, Pres+Concen Crops export quantities, tonnes) (Whey, condensed Crops export quantities, tonnes) (Whey, dry Crops export quantities, tonnes) (Wine Crops export quantities, tonnes) (Wine+Vermouth+Sim. Crops export quantities, tonnes) (Wool, degreased Crops export quantities, tonnes) (Wool, greasy Crops export quantities, tonnes) (Wool, hair waste Crops export quantities, tonnes) (Yoghurt, concentrated or not Crops export quantities, tonnes) (Consumer price indices, %)
672 Bahrain 1998 6.997873e+03 1650.0 10960.0 3000.0 20235.0 1800.0 1404.0 1250.0 ... 5.0 0.0 8.0 0.0 0.0 83.5 142.0 38.0 30.0 97.895927
1235 Brazil 2009 1.666996e+06 1650.0 38205.0 3000.0 20235.0 1800.0 1404.0 8411.0 ... 1.0 0.0 1.0 28375.0 29062.0 0.0 8487.0 244.0 0.0 94.728103
9845 Yemen 2012 3.207477e+04 1650.0 3192.0 471.0 20235.0 1800.0 1404.0 1250.0 ... 0.0 0.0 0.0 0.0 0.0 83.5 142.0 38.0 53.0 131.360906
11130 Italy 1965 6.406727e+03 1650.0 107700.0 12000.0 20235.0 55450.0 5724.0 1250.0 ... 0.0 0.0 0.0 185547.0 241787.0 998.0 2210.0 1090.0 0.0 97.895927
9470 United Republic of Tanzania 2005 1.850985e+04 2200.0 10960.0 3000.0 20235.0 1800.0 1404.0 1250.0 ... 5.0 0.0 8.0 5.0 5.0 83.5 97.0 0.0 0.0 66.332547
680 Bahrain 2006 1.850476e+04 1650.0 10960.0 3000.0 20235.0 1800.0 1404.0 1250.0 ... 5.0 0.0 8.0 11.0 11.0 83.5 142.0 38.0 0.0 89.511213
3440 Gambia 2001 6.874094e+02 1650.0 10960.0 3000.0 20235.0 1800.0 1404.0 1250.0 ... 5.0 0.0 8.0 44.0 46.0 83.5 142.0 38.0 0.0 53.261431
11568 Norway 1962 6.406727e+03 1650.0 10960.0 3000.0 20235.0 1800.0 1404.0 1250.0 ... 0.0 0.0 0.0 0.0 0.0 90.0 1100.0 60.0 0.0 97.895927
5995 Mozambique 1980 5.730083e+03 1650.0 10960.0 3000.0 20235.0 1800.0 1404.0 1250.0 ... 5.0 0.0 8.0 0.0 0.0 83.5 142.0 38.0 0.0 97.895927
8557 Swaziland 2012 4.868482e+03 1650.0 10960.0 3000.0 20235.0 1800.0 1404.0 130.0 ... 0.0 0.0 0.0 1.0 1.0 2.0 0.0 0.0 0.0 115.590588
7438 Saint Kitts and Nevis 1997 3.568804e+02 1650.0 10960.0 3000.0 20235.0 1800.0 1404.0 1250.0 ... 5.0 0.0 8.0 44.0 46.0 83.5 142.0 38.0 0.0 97.895927
3772 Guatemala 2011 4.765467e+04 1077.0 4625.0 3000.0 20235.0 1800.0 1404.0 9590.0 ... 163.0 0.0 163.0 319.0 319.0 0.0 0.0 0.0 582.0 107.738291
10845 Gabon 1968 6.406727e+03 1650.0 10960.0 3000.0 20235.0 1800.0 1404.0 1250.0 ... 5.0 0.0 8.0 44.0 46.0 83.5 142.0 38.0 0.0 97.895927
3117 Ethiopia 2000 8.030203e+03 800.0 10960.0 3000.0 20235.0 1800.0 1404.0 9754.0 ... 5.0 0.0 8.0 20.0 20.0 83.5 142.0 38.0 0.0 33.787239
1152 Botswana 1972 1.146934e+02 1650.0 10960.0 3000.0 20235.0 1800.0 1404.0 1250.0 ... 5.0 0.0 8.0 0.0 0.0 0.0 0.0 0.0 0.0 97.895927
5790 Monaco 2005 4.202976e+03 1650.0 10960.0 3000.0 20235.0 1800.0 1404.0 1250.0 ... 5.0 0.0 8.0 44.0 46.0 83.5 142.0 38.0 0.0 97.895927
12147 Tonga 1966 6.406727e+03 1650.0 10960.0 3000.0 20235.0 1800.0 1404.0 1250.0 ... 5.0 0.0 8.0 44.0 46.0 83.5 142.0 38.0 0.0 97.895927
94 Algeria 1972 7.176428e+03 1650.0 2500.0 4300.0 20235.0 3070.0 1404.0 1250.0 ... 5.0 0.0 8.0 567004.0 581025.0 176.0 26.0 38.0 0.0 97.895927
11401 Morocco 1966 6.406727e+03 20000.0 10960.0 3000.0 20235.0 4006.0 6400.0 125.0 ... 5.0 0.0 8.0 149891.0 159373.0 411.0 471.0 0.0 0.0 97.895927
5118 Liechtenstein 1977 3.390874e+02 1650.0 10960.0 3000.0 20235.0 1800.0 1404.0 1250.0 ... 5.0 0.0 8.0 44.0 46.0 83.5 142.0 38.0 0.0 97.895927
7195 South Korea 1984 9.659793e+04 1650.0 39189.0 3000.0 20235.0 1800.0 1404.0 1250.0 ... 78.0 0.0 78.0 3.0 17.0 104.0 6.0 364.0 0.0 97.895927
3699 Grenada 1984 1.206921e+02 1650.0 10960.0 3000.0 20235.0 1800.0 1404.0 1250.0 ... 5.0 0.0 8.0 44.0 46.0 83.5 142.0 38.0 0.0 97.895927
9226 Tuvalu 1991 1.014523e+01 1650.0 10960.0 3000.0 20235.0 1800.0 1404.0 1250.0 ... 5.0 0.0 8.0 44.0 46.0 83.5 142.0 38.0 0.0 97.895927
6924 Peru 1989 3.475405e+04 1650.0 12587.0 43.0 20235.0 105.0 8256.0 7540.0 ... 5.0 0.0 8.0 123.0 123.0 17.0 200.0 237.0 0.0 97.895927
11327 Martinique 1991 6.406727e+03 1650.0 10960.0 3000.0 20235.0 1800.0 1404.0 240.0 ... 5.0 0.0 8.0 44.0 46.0 83.5 142.0 38.0 0.0 97.895927
11873 Saint Vincent and the Grenadines 1967 6.406727e+03 1650.0 10960.0 3000.0 20235.0 1800.0 1404.0 1250.0 ... 5.0 0.0 8.0 44.0 46.0 83.5 142.0 38.0 0.0 97.895927
7331 Russia 1982 6.406727e+03 1650.0 10960.0 3000.0 20235.0 1800.0 1404.0 1250.0 ... 5.0 0.0 8.0 44.0 46.0 83.5 142.0 38.0 0.0 97.895927
3773 Guatemala 2012 5.038843e+04 1100.0 6011.0 3000.0 20235.0 1800.0 1404.0 9435.0 ... 181.0 0.0 181.0 476.0 476.0 0.0 0.0 0.0 517.0 112.840219
1217 Brazil 1991 3.785813e+05 1650.0 25630.0 3000.0 20235.0 1800.0 1404.0 15402.0 ... 5.0 0.0 8.0 4325.0 4351.0 0.0 5436.0 355.0 0.0 97.895927
1170 Botswana 1990 3.721284e+03 1650.0 10960.0 3000.0 20235.0 1800.0 1404.0 1250.0 ... 5.0 0.0 8.0 78.0 79.0 2.0 2.0 0.0 0.0 97.895927

30 rows × 1387 columns

2.A.a Crops and livestock production and trade

TODO

2.A.b Introducing the concept of food self-sufficiency

In this section we will present and compute the notion of food self-sufficiency.

2.A.b.i Basic idea

One may wonder how to know whether a country produce all the food it needs or not. The notion of food-self-sufficency allows to answer to this question. More formally, it is a rate that decribes how much does a country can satisfy to meet its internal consumption needs by production. It describes the extent to which a country is able to feed its population through its domestic food production. We are interested into this measure since we think it could be correlated with the economic conditions of this country.

2.A.b.ii Formula and computation

In order to compute the food self-sufficiency, we will apply the following formula that gives us the food self-sudfficiency as a percentage :

$$\frac{Production \times 100}{Production + Imports – Exports}$$
In [53]:
all_columns = list(uni_df)
production_columns = []
import_columns = []
export_columns = []
for column in all_columns:
    if (type(column)==tuple) and column[1]=='tonnes':
        if 'export quantities' in column[0]:
            export_columns.append(column)
        elif 'import quantities' in column[0]:
            import_columns.append(column)
        elif 'Production' in column[0]:
            production_columns.append(column)
            
uni_df[('All productions','tonnes')] = 0
for column in production_columns :
    uni_df[('All productions','tonnes')] += uni_df[column]            
uni_df[('All imports','tonnes')] = 0
for column in import_columns :
    uni_df[('All imports','tonnes')] += uni_df[column]            
uni_df[('All exports','tonnes')] = 0
for column in export_columns :
    uni_df[('All exports','tonnes')] += uni_df[column]
            
uni_df[('food self-sufficiency','%')] = 100 * uni_df[('All productions','tonnes')] / (uni_df[('All productions','tonnes')]+uni_df[('All imports','tonnes')]+uni_df[('All exports','tonnes')])
In [54]:
display(uni_df[['Area','Year',('food self-sufficiency','%')]].sample(5))
Area Year (food self-sufficiency, %)
1500 Cabo Verde 1998 91.021521
12844 Cameroon 2016 91.903985
3744 Guatemala 1983 81.325173
3756 Guatemala 1995 77.142840
6924 Peru 1989 83.800515
In [55]:
plot = uni_df[['Area','Year']]
plot["Value"] = uni_df[('food self-sufficiency','%')]
for year in range(1980, 2010, 5):
    display(year, visualise_world_data_folium(plot, year, False))
    
C:\Users\Martin\.conda\envs\ada\lib\site-packages\ipykernel_launcher.py:2: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  
1980
1985
1990
1995
2000
2005

2.B. Consumer price indices

      1. what is it

      2. why do we care

TODO

2.C. Structure of international trade and historical context

Our dataset contains data for the historical period from 1970 to 2015. In order to be able to correctly interpret the results we are going to see, we first made an historical research on this period. We shortly listed below important events of this period for which we think they have had a significant influence on the agriculture and the economy.

There was the Cold war from 1945 to 1990 with two economic superpowers (USA and USSR). The USSR had been dissoluted in 1991. The Japanese economic miracle occured from 1945 to 1990 and allowed Japan to come out of the disastrous state in which it was at the exit of the WW2 and become world's second largest economy. There has been 2 big oil crisis in 1973 and 1979. There has been many wars (Middle East wars 1973-2000 e.g. Yom Kippur War 1973, Islamic Revolution in Iran 1979, Iran–Iraq war 1980-1988, Gulf war 1990-1991, Yugoslav wars 1991-2001...). We have already seen some consequences of such events by dealing with countries names in a previous section.

The third Agricultural Revolution (also known as Green revolution) occurs form 1960 to 1990 and imporved agricultural productions thanks to fertilizers and chemicals.

The following public-domain image from Wikimedia represents developed countries (blue), developing ones (orange) and least developed ones (red) according to the United Nations and International Monetary Fund. We expect to see similar results with our dataset (GDP).

The following image, also from Wikimedia shows the cumulative commercial balance for the period 1980-2008. We also expect to see similar results with our dataset, but there might be difference as we focus on agriculture.

2.D. Economic classification of countries

In [56]:
plot = uni_df[['Area','Year']]
plot["Value"] = uni_df["(GDP, million $)"]
for year in range(1980, 2015, 5):
    display(year, visualise_world_data_folium(plot, year, True))
C:\Users\Martin\.conda\envs\ada\lib\site-packages\ipykernel_launcher.py:2: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  
1980
1985
1990
1995
2000
2005
2010

4. Informed plan for next actions

Our results seem pretty intersting to share to the world. Moreover we have nice interactive maps and we would like to focus more on visual and style than writting on methodology. Therefore, we would like to produce a data story.